{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "77cd2b62-d55c-48cd-8629-c726b41eafbd",
   "metadata": {},
   "source": [
    "# Function Calling Program for Structured Extraction\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/output_parsing/function_program.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "This guide shows you how to do structured data extraction with our `FunctionCallingProgram`. Given a function-calling LLM as well as an output Pydantic class, generate a structured Pydantic object. We use three different function calling LLMs:\n",
    "- OpenAI\n",
    "- Anthropic Claude\n",
    "- Mistral\n",
    "\n",
    "In terms of the target object, you can choose to directly specify `output_cls`, or specify a `PydanticOutputParser` or any other BaseOutputParser that generates a Pydantic object.\n",
    "\n",
    "in the examples below, we show you different ways of extracting into the `Album` object (which can contain a list of Song objects).\n",
    "\n",
    "**NOTE**: The `FunctionCallingProgram` only works with LLMs that natively support function calling, by inserting the schema of the Pydantic object as the \"tool parameters\" for a tool. For all other LLMs, please use our `LLMTextCompletionProgram`, which will directly prompt the model through text to get back a structured output."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3719f4ff-480a-4c39-b897-a37a9c841245",
   "metadata": {},
   "source": [
    "## Define `Album` class\n",
    "\n",
    "This is a simple example of parsing an output into an `Album` schema, which can contain multiple songs.\n",
    "\n",
    "Just pass `Album` into the `output_cls` property on initialization of the `FunctionCallingProgram`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42273736-e2a1-46b3-8d50-cfe02e00afa5",
   "metadata": {},
   "source": [
    "If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "65a0a440-c8ee-42d8-986b-3b16984baff6",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c8275a82-a581-48c2-9f7b-1f8ac89413e3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pydantic import BaseModel\n",
    "from typing import List\n",
    "\n",
    "from llama_index.core.program import FunctionCallingProgram"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a6372f5-8e29-487d-8bcc-f7c2f532e01d",
   "metadata": {},
   "source": [
    "Define output schema"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a3eb837d-326b-45d6-8cfa-d1a7b6bbe56c",
   "metadata": {},
   "outputs": [],
   "source": [
    "class Song(BaseModel):\n",
    "    \"\"\"Data model for a song.\"\"\"\n",
    "\n",
    "    title: str\n",
    "    length_seconds: int\n",
    "\n",
    "\n",
    "class Album(BaseModel):\n",
    "    \"\"\"Data model for an album.\"\"\"\n",
    "\n",
    "    name: str\n",
    "    artist: str\n",
    "    songs: List[Song]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13a01b4a-7fef-429c-8b10-7ec7dc8f3e25",
   "metadata": {},
   "source": [
    "## Define Function Calling Program\n",
    "\n",
    "We define a function calling program with three function-calling LLMs: \n",
    "- OpenAI \n",
    "- Anthropic\n",
    "- Mistral"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b19fcbf-da16-43c4-9df8-f01810fac7cb",
   "metadata": {},
   "source": [
    "### Function Calling Program with OpenAI\n",
    "\n",
    "Here we use gpt-3.5-turbo.\n",
    "\n",
    "We demonstrate structured data extraction \"single\" function calling and also parallel function calling, allowing us to extract out multiple objects."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a2ca1d28-7e14-4a9d-be51-84110974dc6e",
   "metadata": {},
   "source": [
    "#### Function Calling (Single Object)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5599fdb4-ac24-4925-9ea9-f559b31222a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.program import FunctionCallingProgram\n",
    "from llama_index.llms.openai import OpenAI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f8d86c9e-9177-4e47-aa96-3c346d845592",
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt_template_str = \"\"\"\\\n",
    "Generate an example album, with an artist and a list of songs. \\\n",
    "Using the movie {movie_name} as inspiration.\\\n",
    "\"\"\"\n",
    "llm = OpenAI(model=\"gpt-3.5-turbo\")\n",
    "\n",
    "program = FunctionCallingProgram.from_defaults(\n",
    "    output_cls=Album,\n",
    "    prompt_template_str=prompt_template_str,\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62dedb60-2f34-4086-a2e8-294d64baf154",
   "metadata": {},
   "source": [
    "Run program to get structured output.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d8991fed-467f-4f4d-8bfa-5b3a4782b766",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "=== Calling Function ===\n",
      "Calling function: Album with args: {\"name\": \"The Shining Soundtrack\", \"artist\": \"Various Artists\", \"songs\": [{\"title\": \"Main Title\", \"length_seconds\": 180}, {\"title\": \"Rocky Mountains\", \"length_seconds\": 240}, {\"title\": \"Lullaby\", \"length_seconds\": 200}, {\"title\": \"The Overlook Hotel\", \"length_seconds\": 220}, {\"title\": \"Grady's Story\", \"length_seconds\": 180}, {\"title\": \"The Maze\", \"length_seconds\": 210}]}\n",
      "=== Function Output ===\n",
      "name='The Shining Soundtrack' artist='Various Artists' songs=[Song(title='Main Title', length_seconds=180), Song(title='Rocky Mountains', length_seconds=240), Song(title='Lullaby', length_seconds=200), Song(title='The Overlook Hotel', length_seconds=220), Song(title=\"Grady's Story\", length_seconds=180), Song(title='The Maze', length_seconds=210)]\n"
     ]
    }
   ],
   "source": [
    "output = program(movie_name=\"The Shining\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "56dab4b5-b403-467d-a3d8-f6423f8b39ec",
   "metadata": {},
   "source": [
    "The output is a valid Pydantic object that we can then use to call functions/APIs. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4b39a61a-761b-47b4-a8e7-fb2c0636f805",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Album(name='The Shining Soundtrack', artist='Various Artists', songs=[Song(title='Main Title', length_seconds=180), Song(title='Rocky Mountains', length_seconds=240), Song(title='Lullaby', length_seconds=200), Song(title='The Overlook Hotel', length_seconds=220), Song(title=\"Grady's Story\", length_seconds=180), Song(title='The Maze', length_seconds=210)])"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35b98612-089e-45f7-b639-90121c6158eb",
   "metadata": {},
   "source": [
    "#### Function Calling (Parallel Function Calling, Multiple Objects)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "40eea03f-c331-4af5-9f86-a11927325cd2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "=== Calling Function ===\n",
      "Calling function: Album with args: {\"name\": \"The Shining\", \"artist\": \"Various Artists\", \"songs\": [{\"title\": \"Main Theme\", \"length_seconds\": 180}, {\"title\": \"The Overlook Hotel\", \"length_seconds\": 240}, {\"title\": \"Redrum\", \"length_seconds\": 200}]}\n",
      "=== Function Output ===\n",
      "name='The Shining' artist='Various Artists' songs=[Song(title='Main Theme', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='Redrum', length_seconds=200)]\n",
      "=== Calling Function ===\n",
      "Calling function: Album with args: {\"name\": \"The Blair Witch Project\", \"artist\": \"Soundtrack Ensemble\", \"songs\": [{\"title\": \"Into the Woods\", \"length_seconds\": 210}, {\"title\": \"The Rustling Leaves\", \"length_seconds\": 180}, {\"title\": \"The Witch's Curse\", \"length_seconds\": 240}]}\n",
      "=== Function Output ===\n",
      "name='The Blair Witch Project' artist='Soundtrack Ensemble' songs=[Song(title='Into the Woods', length_seconds=210), Song(title='The Rustling Leaves', length_seconds=180), Song(title=\"The Witch's Curse\", length_seconds=240)]\n",
      "=== Calling Function ===\n",
      "Calling function: Album with args: {\"name\": \"Saw\", \"artist\": \"Horror Soundscapes\", \"songs\": [{\"title\": \"The Reverse Bear Trap\", \"length_seconds\": 220}, {\"title\": \"Jigsaw's Game\", \"length_seconds\": 260}, {\"title\": \"Bathroom Escape\", \"length_seconds\": 180}]}\n",
      "=== Function Output ===\n",
      "name='Saw' artist='Horror Soundscapes' songs=[Song(title='The Reverse Bear Trap', length_seconds=220), Song(title=\"Jigsaw's Game\", length_seconds=260), Song(title='Bathroom Escape', length_seconds=180)]\n"
     ]
    }
   ],
   "source": [
    "prompt_template_str = \"\"\"\\\n",
    "Generate example albums, with an artist and a list of songs, using each movie below as inspiration. \\\n",
    "\n",
    "Here are the movies:\n",
    "{movie_names}\n",
    "\"\"\"\n",
    "llm = OpenAI(model=\"gpt-3.5-turbo\")\n",
    "\n",
    "program = FunctionCallingProgram.from_defaults(\n",
    "    output_cls=Album,\n",
    "    prompt_template_str=prompt_template_str,\n",
    "    verbose=True,\n",
    "    allow_parallel_tool_calls=True,\n",
    ")\n",
    "output = program(movie_names=\"The Shining, The Blair Witch Project, Saw\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eb9dcab1-1bf7-4cfb-bd82-eaf71d809f1b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Album(name='The Shining', artist='Various Artists', songs=[Song(title='Main Theme', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='Redrum', length_seconds=200)]),\n",
       " Album(name='The Blair Witch Project', artist='Soundtrack Ensemble', songs=[Song(title='Into the Woods', length_seconds=210), Song(title='The Rustling Leaves', length_seconds=180), Song(title=\"The Witch's Curse\", length_seconds=240)]),\n",
       " Album(name='Saw', artist='Horror Soundscapes', songs=[Song(title='The Reverse Bear Trap', length_seconds=220), Song(title=\"Jigsaw's Game\", length_seconds=260), Song(title='Bathroom Escape', length_seconds=180)])]"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8eee2f40-fdbd-4141-b388-d6fb4f178ae4",
   "metadata": {},
   "source": [
    "### Function Calling Program with Anthropic\n",
    "\n",
    "Here we use Claude Sonnet (all three models support function calling)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f114bd12-b109-408f-9281-6bde2a95c162",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.program import FunctionCallingProgram\n",
    "from llama_index.llms.anthropic import Anthropic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6df92b19-eb63-483d-9eff-b1f5b9e17fe4",
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt_template_str = \"Generate a song about {topic}.\"\n",
    "llm = Anthropic(model=\"claude-3-sonnet-20240229\")\n",
    "\n",
    "program = FunctionCallingProgram.from_defaults(\n",
    "    output_cls=Song,\n",
    "    prompt_template_str=prompt_template_str,\n",
    "    llm=llm,\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79e48c86-0b08-496b-b855-ddffbb1d8e04",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "=== Calling Function ===\n",
      "Calling function: Song with args: {\"title\": \"The Boy Who Lived\", \"length_seconds\": 180}\n",
      "=== Function Output ===\n",
      "title='The Boy Who Lived' length_seconds=180\n"
     ]
    }
   ],
   "source": [
    "output = program(topic=\"harry potter\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8677eedd-cf80-422d-bb64-4fbdb8885967",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Song(title='The Boy Who Lived', length_seconds=180)"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f93b3cee-8d6e-4d96-97e4-2299d289bc9c",
   "metadata": {},
   "source": [
    "### Function Calling Program with Mistral\n",
    "\n",
    "Here we use mistral-large."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e46076ba-0e95-4f32-a886-220b7ed74309",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.program import FunctionCallingProgram\n",
    "from llama_index.llms.mistralai import MistralAI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c039af17-2e03-4c53-b708-35e79b50a32c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# prompt_template_str = \"\"\"\\\n",
    "# Generate an example album, with an artist and a list of songs. \\\n",
    "# Use the broadway show {broadway_show} as inspiration. \\\n",
    "# Make sure to use the tool.\n",
    "# \"\"\"\n",
    "prompt_template_str = \"Generate a song about {topic}.\"\n",
    "llm = MistralAI(model=\"mistral-large-latest\")\n",
    "program = FunctionCallingProgram.from_defaults(\n",
    "    output_cls=Song,\n",
    "    prompt_template_str=prompt_template_str,\n",
    "    llm=llm,\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ba59e21-7e30-474c-a457-7c84e2b72292",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "=== Calling Function ===\n",
      "Calling function: Song with args: {\"title\": \"Defying Gravity\", \"length_seconds\": 240}\n",
      "=== Function Output ===\n",
      "title='Defying Gravity' length_seconds=240\n"
     ]
    }
   ],
   "source": [
    "output = program(topic=\"the broadway show Wicked\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "009d870d-247d-4aa5-a0c9-52260f56b631",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Song(title='Defying Gravity', length_seconds=240)"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a374b38-b151-4ad5-bd06-97c47552bbd3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.output_parsers import PydanticOutputParser\n",
    "\n",
    "program = LLMTextCompletionProgram.from_defaults(\n",
    "    output_parser=PydanticOutputParser(output_cls=Album),\n",
    "    prompt_template_str=prompt_template_str,\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2459a230-5dfd-4c80-89f4-6bbf175b258f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Album(name='The Fellowship of the Ring', artist='Middle-earth Ensemble', songs=[Song(title='The Shire', length_seconds=240), Song(title='Concerning Hobbits', length_seconds=180), Song(title='The Ring Goes South', length_seconds=300), Song(title='A Knife in the Dark', length_seconds=270), Song(title='Flight to the Ford', length_seconds=210), Song(title='Many Meetings', length_seconds=240), Song(title='The Council of Elrond', length_seconds=330), Song(title='The Great Eye', length_seconds=180), Song(title='The Breaking of the Fellowship', length_seconds=360)])"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output = program(movie_name=\"Lord of the Rings\")\n",
    "output"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama_index_v3",
   "language": "python",
   "name": "llama_index_v3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
